REMARKS

This is in response to the Office Action mailed October 22, 2002, in the above-referenced
application. The rejections of record are addressed below in the order presented in the Office
Action.

Claims 1-6 remain rejected and Claims 11 and 12 are newly rejected under 35 USC
§ 112, second paragraph, as indefinite. Applicants submit that the foregoing amendments
obviate the indefiniteness rejection.

Claim 1 states that the artificial antigen consists of a recombinant or synthetic
polypeptide having at least one citrulline residue. Claim 1 further states that the polypeptide is
derived from a filaggrin unit as represented by SEQ ID NO: 7 or a fragment thereof that includes
at least five consecutive amino acids comprising at least one arginine residue. A paper copy of a
substitute sequence listing, including SEQ ID NO: 7, is enclosed herewith, and a computer
readable form thereof will follow shortly.

Applicants maintain that the sequence of human filaggrin is well known to one skilled in 
the art and accordingly there is no need to identify such a sequence. However, to advance 
prosecution of this matter, SEQ ID NO: 7 is presented herewith to provide the consensus 
sequence of human filaggrin as published by Gan et al., referred to on page 14 of the present 
application. A copy of the Gan et al. article is also enclosed herewith. 

With regard to the language "derived from," Applicants also maintain that this language 
is not indefinite because the claims clearly recite that at least one arginine residue is replaced 
with a citrulline residue. Further, Claim 1 now refers to SEQ ID NO: 7, a consensus sequence of
human filaggrin, thereby obviating the Examiner's argument that the amino acid residues can be
of any sequence and not actually from filaggrin.

Claim 2 is amended to refer to SEQ ID NO: 7. This also obviates the Office's concern
regarding fragment 144 to 314 and fragment 76 to 144. Claim 3 is similarly amended to refer to
fragment 71-119 of SEQ ID NO: 7. Claim 4 is amended to state that the fragment is selected
from the peptides identified by SEQ ID NO: 3, 5, and 6. Claim 6 is amended to delete the
language objected to by the Office.



In re: Guy Serre et al. 
Appl.No.: 09/254,032 
Filed: April 26, 1999 
Page 5 

In view of the foregoing, Applicants respectfully request withdrawal of the indefiniteness 
rejection. 

Claims 1, 5, and 6 remain rejected under 35 USC § 102(b) as anticipated by Simon et al. 
Applicants respectfully traverse this rejection. 

As explained in Applicants' prior response, naturally occurring human filaggrin does not
consist of a single polypeptide. Rather, naturally occurring human filaggrin includes a 
population of polypeptides of different sequences since it is synthesized as a large precursor 
(profilaggrin) comprising filaggrin units displaying important variations between them. For 
reference please see the enclosed Gan et al. article referred to above. 

In contrast, when a recombinant or synthetic filaggrin or filaggrin fragment is prepared, it
is obtained from the sequence of an individual filaggrin unit. This results in a population of
polypeptides having the same sequence.

Accordingly, the antigen of Simon et al., which includes a mixture of polypeptides
derived from filaggrin units of different sequences, is clearly different from the antigen of Claim
1, which is derived from a single filaggrin unit. Stated differently, the antigen of Claim 1 is a
homogeneous preparation resulting from the citrullination of a recombinant or synthetic filaggrin
or filaggrin fragment. Accordingly, the antigen includes polypeptides having the same sequence,
and does not include a mixture of polypeptides of different sequences.

Claim 6 is further removed from Simon et al. because it explicitly excludes preparations
having the characteristics of the antigen of Simon et al., i.e., comprising a mixture of isoforms of
filaggrin having a molecular weight of 40,000 and a pI ranging between 5.8 and 7.4.

Accordingly, Applicants respectfully submit that the claimed invention is not anticipated
by Simon et al. and request withdrawal of this rejection.

Claims 6, 11 and 12 are rejected under 35 USC § 102(e) as anticipated by U.S. Patent No.
5,888,833 to Serre et al. Because the human antigen of the '833 patent is the same as the antigen
of Simon et al., Applicants respectfully submit that Claims 6, 11 and 12 also are not anticipated
for the reasons set forth above. Applicants accordingly request withdrawal of this rejection as
well.

Claims 6 and 11-12 are rejected under the judicially created doctrine of double patenting
over Claim 2 of U.S. Patent No. 5,888,833. Applicants respectfully request withdrawal of this 
rejection in view of the foregoing comments, namely, that the human antigen of the '833 patent 
differs from the antigen as claimed. 

In addition, new Claims 13-19 are presented herewith which are even further removed 
from Simon et al. and the '833 patent. Claims 13-18 are directed to a process for preparing an 
artificial antigen. In the process, a recombinant or synthetic polypeptide consisting of a filaggrin 
unit or fragment thereof of at least five consecutive amino acids is provided. At least one 
arginine residue of the polypeptide is replaced with a citrulline residue. Claim 19 is directed to a
method for in vitro diagnosis of rheumatoid arthritis using such an artificial antigen. 
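
The process recited in new Claims 13-18 can be illustrated with a minimal sketch. It is offered only as an aid to reading the claim language and makes several assumptions that are not part of the record: the six-residue peptide is invented and is not SEQ ID NO: 7 or any fragment of it, the token `(Cit)` is an arbitrary placeholder for citrulline, and the function names are hypothetical.

```python
# Illustrative sketch only. The peptide used below is invented; it is NOT
# SEQ ID NO: 7 and is not taken from the sequence listing.

MIN_FRAGMENT_LENGTH = 5   # "at least five consecutive amino acids"
CITRULLINE = "(Cit)"      # placeholder token (assumption); citrulline has no standard one-letter code


def eligible_fragments(sequence, min_len=MIN_FRAGMENT_LENGTH):
    """Return each contiguous fragment of at least min_len residues that contains an arginine (R)."""
    fragments = []
    for start in range(len(sequence)):
        for end in range(start + min_len, len(sequence) + 1):
            fragment = sequence[start:end]
            if "R" in fragment:
                fragments.append(fragment)
    return fragments


def citrullinate(fragment, positions=None):
    """Replace arginine residues with the citrulline token.

    positions: 0-based indices of the arginines to replace; None replaces all of them.
    """
    arginines = [i for i, residue in enumerate(fragment) if residue == "R"]
    if not arginines:
        raise ValueError("fragment contains no arginine residue")
    targets = set(arginines if positions is None else positions)
    if not targets or not targets <= set(arginines):
        raise ValueError("positions must name at least one arginine in the fragment")
    return "".join(CITRULLINE if i in targets else residue
                   for i, residue in enumerate(fragment))


example_peptide = "GSRHSQ"  # invented peptide, for illustration only
for fragment in eligible_fragments(example_peptide):
    print(fragment, "->", citrullinate(fragment))
```

Because every fragment is cut from one input sequence, the output is a set of defined sequences rather than a mixture of isoforms, which is the distinction drawn above with respect to Simon et al.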

The rejections of record having been addressed in full in the foregoing, Applicants 
respectfully submit that this application is now in condition for allowance, which action is 
respectfully solicited. Should the Examiner have any questions regarding this matter, it is
respectfully requested that she contact the undersigned at her convenience. 

It is not believed that extensions of time or fees for net addition of claims are required, 
beyond those that may otherwise be provided for in documents accompanying this paper.
However, in the event that additional extensions of time are necessary to allow consideration of 
this paper, such extensions are hereby petitioned under 37 CFR § 1.136(a), and any fee required 
therefor (including fees for net addition of claims) is hereby authorized to be charged to Deposit
Account No. 16-0605. 

Respectfully submitted,




Melissa B. Pendleton
Registration No. 35,459 

Version with Markings to Show Changes Made:

1. (Twice amended) An artificial antigen which is specifically recognized by the
antifilaggrin autoantibodies present in the serum of patients suffering from rheumatoid arthritis,
which consists of a recombinant or synthetic polypeptide [comprising] having at least one
citrulline residue, wherein said polypeptide is derived from a filaggrin unit selected from
SEQ ID NO: 7 or a fragment thereof having at least 5 consecutive amino acids [residues,]
comprising at least one [being an] arginine residue[, of a sequence derived from that of a
filaggrin unit, by replacing at least one arginine residue with a citrulline residue].

2. (Twice amended) The artificial antigen as claimed in claim 1, [which consists of a
peptide comprising all or part of at least one sequence derived from the group consisting of the
sequence corresponding to amino acids] wherein said fragment of at least 5 consecutive amino
acids of a filaggrin unit is selected from:

fragment 144 to 314 of SEQ ID NO: 7 or sub-fragments thereof comprising at least one
arginine residue; [a human filaggrin unit,] and

[the sequence corresponding to amino acids] fragment 76 to 144 of SEQ ID NO: 7 or
sub-fragments thereof comprising at least one arginine residue [a human filaggrin unit, by
replacing at least one arginine residue with a citrulline residue].

3. (Twice amended) The artificial antigen as claimed in claim [2] 1, [which consists
of a peptide comprising all or part of at least one sequence derived from SEQ ID NO: 3, by
replacing at least one arginine residue with a citrulline residue] wherein said fragment of at least
5 consecutive amino acids of a filaggrin unit is fragment 71-119 of SEQ ID NO: 7 or
sub-fragments thereof comprising at least one arginine residue.



4. (Amended) The artificial antigen as claimed in claim 1, [which consists of a
peptide comprising all or part of at least one sequence derived from one of the sequences]
... peptides SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 6, ...
[by replacing] at least one arginine residue [with a citrulline residue].

6. (Twice amended) An antigenic composition [for diagnosing the presence of
autoantibodies specific for rheumatoid arthritis in a biological sample], which contains [at least
one] an antigen as claimed in any one of claims 1 to 4, with the exclusion of compositions with
a structure identical to that of a preparation of isoforms of filaggrin which is purified from the
human epidermis comprising a mixture of isoforms having a molecular weight of 40,000
measured by SDS-PAGE and a pI ranging between 5.8 and 7.4.



Organization, Structure, and Polymorphisms of the Human Profilaggrin Gene*
... protein component of keratohyalin granules of mammalian epidermis.



Filaggrins represent an important class of intermediate filament-associated
proteins (IFAPs) that function, at least in

*The nucleic acid sequence in this paper has been submitted to GenBank
under Accession Number J02929.

To whom correspondence should be addressed at the Laboratory of
Skin Biology, National Institute of Arthritis and Musculoskeletal and
Skin Diseases ...

• Dermatology Branch. 

' Laboratory of Biochemistry. 



part, in the aggregation of keratin intermediate filaments into
an organized "keratin pattern" during terminal stages of
normal differentiation in mammalian epidermis (Dale et al.,
1978, 1989; Steinert et al., 1981; Steinert & Roop, 1988). On
the basis of data from both protein chemical studies (Harding
& Scott, 1983; Resing et al., 1984, 1985) and more recent
cloning experiments (Haydock & Dale, 1986; Rothnagel et
al., 1987; Rothnagel & Steinert, 1990; McKinley-Grant et al.,
1989), filaggrins are initially synthesized as large polyprotein
precursors ("profilaggrins") consisting of many protein repeats



This article not subject to U.S. Copyright. Published 1990 by the American Chemical Society 



arranged in tandem and accumulate in a nonfunctional
phosphorylated form as F-keratohyalin granules late in epidermal
differentiation (Fisher et al., 1987; Rothnagel et al.,
1987; McKinley-Grant et al., 1989; Resing et al., 1989; Steven
et al., 1989; Rothnagel & Steinert, 1990). Subsequently, this
precursor is dephosphorylated and proteolytically cleaved by
excision of a short peptide "linker" sequence to release functional
filaggrin molecules (Resing et al., 1984, 1985, 1989;
Haydock & Dale, 1986, 1990; Rothnagel et al., 1987; Rothnagel
& Steinert, 1990; McKinley-Grant et al., 1989).

In order to understand the expression and function of this
protein in detail and to explore its putative involvement in
keratinizing disorders of the epidermis, we have recently isolated
a cDNA clone encoding one full repeat of the human
profilaggrin gene (McKinley-Grant et al., 1989). The full-length
repeat was shown to be 324 amino acids (972 bp), which
includes a linker of perhaps only 7 amino acids of sequence
FLYQVST; that is, human profilaggrin consists of a tandem
array of filaggrin molecules of about 317 amino acids separated
by the linker sequence. The properties of such a deduced
sequence are indistinguishable from those of isolated human
filaggrin. By in situ hybridization, expression of the gene is
tightly regulated at the transcriptional level in the granular
layer. Although we showed the human gene is localized to
chromosome position 1q21, no further information on gene
structure and organization is available.
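
The repeat arithmetic quoted above can be checked with a short sketch. The constants come from the text; the helper name `tandem_array_length_bp` is hypothetical and is added only for illustration.

```python
# Figures quoted in the text: a 324-residue repeat that includes the 7-residue linker.
CODON_LENGTH_BP = 3
REPEAT_LENGTH_AA = 324
LINKER = "FLYQVST"

# 324 codons of 3 bp each give the 972-bp repeat.
repeat_length_bp = REPEAT_LENGTH_AA * CODON_LENGTH_BP

# Removing the linker leaves the mature filaggrin unit of about 317 residues.
filaggrin_length_aa = REPEAT_LENGTH_AA - len(LINKER)


def tandem_array_length_bp(n_repeats):
    """Coding length of n full repeats laid end to end (hypothetical helper)."""
    if n_repeats < 0:
        raise ValueError("n_repeats must be non-negative")
    return n_repeats * repeat_length_bp


print(repeat_length_bp, filaggrin_length_aa)
```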

In this paper we have isolated and characterized both cDNA
and genomic clones encoding the ends of the gene. This has
enabled elucidation of the structure of the gene, the likely
number of repeats, and the extent of the polymorphisms in it.

Materials and Methods

Molecular Biology Procedures. A human genomic library
in EMBL-3, constructed from DNA isolated from a single
placenta, kindly supplied by Dr. Frank Gonzales (National
Cancer Institute, Bethesda, MD), was screened with the
cDNA clone λHF10, which contains human filaggrin coding
sequences (McKinley-Grant et al., 1989). Three clones were
plaque purified, and their inserts were excised with SalI.
Filaggrin-positive fragments were subcloned into pGEM3Z
for preparation of DNA. Portions were further subcloned into
M13 mp18 or mp19 vectors for sequencing with either
Sequenase 2 (U.S. Biochemical Corp.) or TaqTrack (Promega
Biotec) according to the manufacturer's specifications and
with synthetic oligonucleotides as primers. Portions of these
clones or synthetic oligonucleotides derived from them,
corresponding to 5'- or 3'-noncoding sequences, were used to
reprobe the original λgt11 library (McKinley-Grant et al.,
1989) to find cDNA clones also bearing 5'- or 3'-sequences.
Similarly, λHF10 was used to screen this library to isolate
longer cDNA clones containing multiple filaggrin repeats. The
first-round signals of greatest intensity were sized by Southern
blotting (Rothnagel et al., 1987; McKinley-Grant et al., 1989)
and the longest were plaque purified. Table I summarizes the
genomic DNA and cDNA clones used to generate sequence
information.

DNA was obtained from a single placenta (Oncor Labs)
or purified from 12 whole human foreskins (Maniatis et al.,
1982).

Computer Analyses of Sequences. Protein sequence homologies,
secondary structure prediction analyses, and nucleic
acid sequence analyses were performed on the University of
Wisconsin sequence analysis software packages compiled by
the Wisconsin Genetics Computer Group (Devereux et al.,
1984) and by use of the IBI Pustell sequence analysis software
(version 2, International Biotechnologies Inc.).



Table I: Summary of Genomic DNA and cDNA Clones Used in This Work

clone name      sequence location                    comments
--------------  -----------------------------------  ----------------------------
gene clones
  gλHF5         5'-end of gene                       see Figure 1
  gλHF222       3'-end of gene                       see Figure 2
cDNA clones
  λHF202        5'-end: bp 366-379 and 949-2076      see Figure 1
  λHF604        5'-end: bp 1376-2971                 see Figure 1
  λHF373        5'-end: bp 1447-1989                 see Figure 1
  λHF223        3'-end: bp 2801-5732                 see Figure 2
  λHF10         unknown coding region; 1243 bp       McKinley-Grant et al. (1989)
  λHF41         unknown coding region; 2832 bp       data not shown
  λHF114        unknown coding region; 2325 bp       data not shown
  λHF294        unknown coding region; 1800 bp       data not shown
  λHF336        unknown coding region; 745 bp        data not shown


Results 

Isolation of Clones for the 5'- and 3'-Ends of the Profilaggrin
Gene. cDNA clone λHF10, established in a previous
paper (McKinley-Grant et al., 1989) to encode a portion of
the human profilaggrin mRNA, was used as a probe to screen
a human genomic library in EMBL-3. Three positive clones
were identified and plaque purified to homogeneity. Their
inserts were excised with SalI (which does not cut within
coding regions of the human gene), and the fragments which
were filaggrin positive were as follows: gλHF5, 18 kbp;
gλHF18, 4.5 kbp; gλHF222, 7.7 kbp. Each of these inserts
was successfully subcloned into pGEM3Z for further mapping
analyses. By use of the restriction enzymes HgiAI and XmaI
(which cut each filaggrin repeat once to yield a repeat fragment
of 0.972 kbp), clones gλHF18 and gλHF222 were found to contain
four full filaggrin repeats and clone gλHF5 two full
filaggrin repeats, as well as bands of about 0.5, 2.5, and 16
kbp, respectively, that did not hybridize to the filaggrin probe.
These sequences presumably represent flanking regions of the
gene. The 0.5-kbp piece used as a probe cross-hybridized with
the 2.5- but not the 16-kbp pieces, indicating that gλHF5
represented a different end of the gene from the others. We
were unable to find any clones containing larger numbers of
filaggrin repeats, presumably because BamHI, used in constructing
the genomic library, cuts each filaggrin repeat many
times (McKinley-Grant et al., 1989). Subsequently, for DNA
sequencing, the gλHF222 7.7-kbp piece was cut in half with
SacI. The gλHF5 clone was also cut with PvuII to generate
a 4.5-kbp piece carrying all of the filaggrin-positive sequences.
In addition, the 0.972-kbp pieces obtained by XmaI digestion
of both gλHF5 and gλHF222 were harvested. All of these
fragments were subcloned into M13 vectors for sequencing.
It became clear that gλHF5 encoded the 5'-end (Figure 1)
and gλHF222 the 3'-end of the profilaggrin gene (Figure 2).

Isolation of cDNA Clones for the 5'- and 3'-Ends of the
Profilaggrin mRNA. Synthetic oligomers 60 bp long corresponding
to nucleotides 1605-1664 (see Figure 1) at the 5'-end
of the gene and nucleotides 5461-5520 (see Figure 2) at the
3'-end of the gene were used as probes to rescreen a cDNA
library in λgt11 prepared earlier (McKinley-Grant et al.,
1989). Of about 1 x 10* pfu screened, only three clones
positive for the 5'-end and one clone for the 3'-end were found.
These numbers are far less than the total numbers of filaggrin
clones in the library (about 2% of all plaques), suggesting that
the ends of the mRNA have been substantially processed, as
seems likely from Northern blots (McKinley-Grant et al.,
1989). Clones λHF202 (1.145 kbp) and λHF373 (0.543 kbp)
for the 5'-end and clone λHF223 (2.952 kbp) for the 3'-end
were completely sequenced and are illustrated in Figures 1 and
2, respectively.

FIGURE 1: Sequences of the 5'-end and amino-terminal end of the human profilaggrin gene. The nucleic acid sequence of a portion of the
gene clone gλHF5 is aligned with the entire nucleic acid sequences of the three cDNA clones λHF202, λHF604, and λHF373. Only the 3'
4.456 kbp of gλHF5 from a PvuII site (underlined at the beginning) to the cloning vector site of SalI (underlined at the end) are shown. The
two different XmaI fragments were ordered from the sequence of the intact gλHF5 sequence. The intron sequences in gλHF5 are presented
in lower case. Putative regulatory sequences of the CAT and TATA boxes and the cap site are indicated. The three restriction enzyme sites
DraI, EcoRV, and SpeI used to size the entire gene and the XmaI sites are shown. The deduced amino acid sequences with variations are
indicated with the single-letter code and are numbered from the initiation codon. The symbols (O) and (•) of the amino-terminal 40 residues
mark likely a and d positions, respectively, that may form a coiled-coil α-helix.



Characterization of the 5'-End of the Profilaggrin Gene.
Figure 1 shows the sequences of the genomic clone gλHF5 and
cDNA clones λHF202, λHF373, and λHF604 that encode 5'
information of the gene. Several features are evident. First,
all clones contain a unique in-frame ATG (at bp position 1477)
that meets all of the criteria for a utilized initiating codon
(Kozak, 1989). Their nucleotide sequences are identical prior
to it and for the first 128 bp following it; in the next 1266 bp
of overlapping sequences, there are 180 (14%) variations in
nucleotide and 86 (20%) variations in amino acid sequence.
The two different XmaI fragments identified by subcloning
and sequencing in gλHF5 were ordered as shown in Figure
1. Second, comparisons of gλHF5 and λHF202 reveal the
presence of an intron of 570 bp in the gene that splices the
5'-untranslated region (at bp 379), which meet the obligatory
recognition sequence requirements for introns (Green, 1986).
Primer extension experiments to define the likely "cap" site,
using the 60-bp oligonucleotide from bp 1605-1664 described
above and up to 100 μg of poly(A)-enriched epidermal RNA,
were inconclusive due to the likely processed nature of the
human filaggrin mRNA (McKinley-Grant et al., 1989).
However, there are two possible cap sites at bp 203 and 261.
These are preceded by potential "TATA" boxes at bp 163-185
and "CAT" boxes at bp 21 or 90-100 that fulfill the characteristics
of functional genes. Finally, the available sequence
data reveal several potential regulatory sequences such as the
so-called epidermal-specific enhancer element and a retinoic
acid responsive element (Blessing et al., 1987; Tseng & Green,

FIGURE 2: Sequences of the 3'-end and carboxyl-terminal end of the human profilaggrin gene. The nucleic acid sequence of a portion of gene
clone gλHF222 is aligned with the entire nucleic acid sequence of cDNA clone λHF223. The unsequenced gap of 1420 bp in gλHF222 corresponds
to the position of the two XmaI fragments that could not be ordered. The SalI sites at the beginning (cloning vector site) and end (flanking
gene sequences), likely polyadenylation signal sequences, and the middle SacI site used in subcloning and sequencing are underlined. The
three restriction enzyme sites DraI, EcoRV, and SpeI used in sizing the gene and the XmaI sites are shown. The deduced amino acid sequences
with variations are indicated with the single-letter code and are numbered from the 5'-end of gλHF222, through the unsequenced gap to the
overlap region with λHF223 to the termination codon (•).

1988), but additional sequencing and other functional assays
will be necessary to identify all such sequences and those likely
to exist further upstream.
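
The variation percentages reported for the 1266-bp overlap can be reproduced with a short calculation. One assumption is made here that the text does not state explicitly: the amino acid percentage is taken relative to the 1266 / 3 = 422 codons of the overlap.

```python
def percent(count, total):
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


OVERLAP_BP = 1266             # overlapping sequence following the conserved first 128 bp
NUCLEOTIDE_VARIATIONS = 180   # from the text
AMINO_ACID_VARIATIONS = 86    # from the text

# Assumption: the amino acid denominator is the number of codons in the overlap.
OVERLAP_CODONS = OVERLAP_BP // 3

nucleotide_pct = percent(NUCLEOTIDE_VARIATIONS, OVERLAP_BP)
amino_acid_pct = percent(AMINO_ACID_VARIATIONS, OVERLAP_CODONS)

print(f"nucleotide: {nucleotide_pct:.1f}%  amino acid: {amino_acid_pct:.1f}%")
```

Under that assumption both values round to the figures given in the text (14% and 20%).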

The deduced amino acid sequences from the initiating codon
reveal a conserved aliphatic-polar sequence for the first 40
amino acids of no sequence homology to the filaggrin repeating
sequences. Residues 41-71 reveal about 50% sequence homology,
while residues beyond 71 are highly homologous
(>85%). The first FLYQVST sequence, which represents the
linker region that is cleaved to release individual functional
filaggrin molecules (McKinley-Grant et al., 1989), occurs at
residue 245; that is, the first portion of the gene encodes a
truncated filaggrin repeat with an unusual amino-terminal end.

Analysis of the likely secondary structure reveals that the
first 40 residues have an α-helical conformation. In searching
both GenBank and NBRF sequence data banks, we found the
sequences I-A-T-Y (residues 10-13) and L-L-E (residues
27-29) occur elsewhere only in the coiled-coil sequences of
several IF proteins, including human keratin 1 (Johnson et al.,
1985; Steinert et al., 1985), which is coexpressed in this tissue
with profilaggrin. Apart from this, the amino-terminal 40
residues share little significant homology with any IF or other
coiled-coil α-helical protein. However, this sequence possesses
a weak heptad pattern of the form (a-b-c-d-e-f-g)n, suggesting
that it may form a coiled coil. In most established coiled-coil
proteins, at least 70% of the a and d positions are occupied
by residues with hydrophobic side chains (Conway & Parry,
1988). In this case, 55% (6 of 11) of the a and d residues meet
this requirement.
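
The heptad criterion can be sketched in a few lines. This is only an illustration under stated assumptions: the set of residues counted as hydrophobic, the register offset, and the seven-residue test peptides are all chosen here for the example and are not taken from the paper, whose actual amino-terminal sequence appears only in Figure 1.

```python
# Assumed definition of "hydrophobic side chains"; the paper does not list the residues it counted.
HYDROPHOBIC = set("AVLIMFWY")

COILED_COIL_THRESHOLD = 0.70  # "at least 70% of the a and d positions" (Conway & Parry, 1988)


def ad_positions(length, offset=0):
    """0-based indices of heptad a and d positions; offset is the index of the first a position."""
    return [i for i in range(length) if (i - offset) % 7 in (0, 3)]


def hydrophobic_ad_count(sequence, offset=0):
    """Return (hydrophobic a/d residues, total a/d residues) for a one-letter sequence."""
    indices = ad_positions(len(sequence), offset)
    hits = sum(1 for i in indices if sequence[i] in HYDROPHOBIC)
    return hits, len(indices)


def meets_threshold(hits, total, threshold=COILED_COIL_THRESHOLD):
    """True if the hydrophobic fraction of a/d positions reaches the threshold."""
    return total > 0 and hits / total >= threshold


# Values reported in the text: 6 of 11 a and d residues are hydrophobic.
print(round(100 * 6 / 11), meets_threshold(6, 11))
```

With the reported counts the fraction is about 55%, below the 70% typical of established coiled coils, which is why the text describes the heptad pattern as weak.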

Residues 41-71, which have been less conserved, are likely
to possess a folded structure due to the presence of several
turns.

FIGURE 3: Homology of the carboxyl-terminal sequences of human
and mouse (Rothnagel et al., 1987; Rothnagel & Steinert, 1990)
profilaggrins: (:) identity; (.) homologous residues; (-) deletion.

Characterization of the 3'-End of the Profilaggrin Gene.
Figure 2 shows the sequences of the genomic clone gλHF222
and the cDNA clone λHF223 that encode 3' information of
the gene. Both clones possess the termination codon (at bp
5193) and the entire A-T-rich 3'-noncoding region. Like the
mouse profilaggrin gene (Rothnagel et al., 1987; Rothnagel
& Steinert, 1990), there are no introns in the coding end. The
nucleic acid sequences are identical beyond the last FLYQVST
sequence (at bp 4138), suggesting that this part of the gene
has been conserved. Prior to this, in the last complete filaggrin
repeat, there are 62 (8%) variations in nucleotide
and 30 variations (11%) in amino acid sequence in
the overlap region. The XmaI fragments of gλHF222 were
subcloned into M13 and sequenced, and four different repeat
sequences were found. Two repeats, corresponding to the first
and fourth of the intact clone gλHF222, were recognized and
are shown in Figure 2.

The deduced amino acid sequence of Figure 2 shows that
in the last repeat the sequence deviates completely from
filaggrin-like sequences after amino acid residue 1594. While
the carboxyl-terminal 137 residues have no homology with
typical filaggrin repeat sequences, the last 23 residues share
59% homology with the carboxyl-terminal end of mouse
filaggrin, including a striking -Y-Y-Y-Y terminal sequence
(Figure 3; Rothnagel et al., 1987; Rothnagel & Steinert, 1990).
Thus, like mouse profilaggrin, the human gene possesses a
truncated and modified repeat at its carboxyl-terminal end.
The carboxyl-terminal 137 residues are highly charged (24
basic; 13 acidic) and hydrophobic (23%). Secondary structural
predictions suggest little or no organized structure, having
frequent turns.

Sequence Polymorphisms of the Human Profilaggrin Gene
System. The data of Figures 1 and 2 have revealed considerable
sequence variation between adjacent repeats on the same
genomic clones. This was particularly evident in a sequencing
reaction using gλHF222 and a synthetic oligonucleotide primer
corresponding to the linker region (bp 4138-4155 of Figure
2), which hybridizes to gλHF222 in five locations. A sequencing
gel covering approximately 90-220 bp from the linker
region (Figure 4) reveals that 12% of the base positions are
heterogeneous. In order to understand these sequence polymorphisms
in more detail, additional cDNA clones were obtained
from the λgt11 library. The longest clones isolated were
λHF114 (2.325 kbp) and λHF41 (2.832 kbp) and
a third that was λHF223 (see Figure 2). (Several other long
clones consisted of EcoRI fragments, which had randomly
ligated together during preparation of the library, but
provided more filaggrin sequence data.) Together with these
new clones and the cDNA and genomic DNA clones described
above and previously (McKinley-Grant et al., 1989), we are
able to compile a data base of sequence information on human
filaggrin, including sequences from multiple individual persons
and some with multiple adjacent repeats. Figure 5 shows a
... of complete repeats. We
found that all repeats are precisely 972 bp (324 amino acids)
long, except those repeats located at the 5'- and 3'-ends of the
mRNA (Figures 1 and 2). In the full-length and partial
repeats of the gλHF5 and gλHF222 clones from the same individual,
100 of 324 (31%) of the residue positions are variable,
of which 14 (4%) vary more than twice (Figure 5, capitals).
Of these variations, all but four can be accounted for by
single-base changes in the codons utilized. When all available
sequence data are considered (Figure 5), 126 of 324 (39%)
of the residue positions are variable, with several (a total of
32, 9%) more than twice. Most of the additional 26 variations
would have required multiple mutations in the codons utilized.
The longest conserved region occurs in the vicinity of the linker
(residues 319-13). This corresponds to the region in the DNA
sequence where several restriction enzymes such as HgiAI cut
each repeat, generating the superstoichiometric repeat on
Southern blots (McKinley-Grant et al., 1989).

A comparison of the full-length and partial repeats from
the two genomic clones (Figure 5, capitals) reveals that 60%
of amino acid sequence variations are conservative; only 7%
involve exchanges between hydrophilic and hydrophobic residues,
but 33% involve changes in charge. While the molecular
masses of these repeats vary little (34 ± 0.2 kDa), their pI values vary
more widely (8.3 ± 1.1). Human filaggrin has been shown
to consist of multiple isoelectric variants, attributed to incomplete
dephosphorylation or desimidation (Harding & Scott,
1983), but these data clearly show that another major reason
for charge heterogeneity is sequence polymorphism.

Size of the Full-Length Human Profilaggrin Gene. The
data of Figures 1 and 2 reveal the presence of DraI, EcoRV,
and SpeI restriction enzyme sites that occur only in the conserved
5'- and 3'-ends of the gene, which permit calculations
of the size of the full-length gene. These calculations assume
that the distances between restriction enzyme sites and the first
FLYQVST linker sequence (bp 2212 of Figure 1) and between
the last FLYQVST linker sequence (bp 4138 of Figure 2) and
the restriction enzyme sites have been conserved, as our sequence
data indicate. Surprisingly, the sample of genomic
DNA used, from a different source than used previously
(McKinley-Grant et al., 1989), yielded two bands of equal
intensity with each enzyme of sizes 13.2 and 12.2, 13.0 and
12.1, and 15.3 and 14.2 kbp, respectively (Figure 6A). This
means that this DNA sample contains profilaggrin genes
having 10 and 11 full filaggrin repeats, in addition to the
partial and modified repeats at the 5'- and 3'-ends. The genomic
DNA utilized previously (McKinley-Grant et al., 1989)
yielded a single band that corresponds to a gene with 12 full
filaggrin repeats. These observations were explored further
with DNA from another 12 individuals which was cut with
DraI (Figure 6B). All samples contain either one band or two
bands of equal intensity of three size classes about 1 kbp apart
and correspond to genes containing 11 only, 12 only, 10 and
11, 11 and 12, or 10 and 12 repeats.

These apparent allelic forms of the human profilaggrin gene
were further examined with DNA derived from many individuals
in several three-generation kindreds (CEPH cell lines;
White et al., 1990) with no known involved keratinizing disorders
of the skin and of several different racial and ethnic
groups. When cut with EcoRV, DNA from two kindred
families (Figure 7), as well as 24 other families (data not
shown), revealed only one or two bands in all cases, corresponding
in size to 10, 11, or 12 filaggrin repeats. The distributions
of the repeat numbers in the various family members
(Figure 7) indicate normal Mendelian inheritance. Thus,
based on the analyses of the DNA of more than 300 different
individuals (26 of 40 CEPH families and 14 other individuals;
a total of 44 shown in this paper), it is clear that the range
in size of the normal profilaggrin gene within the human
population is limited.

FIGURE 6: Size of the human profilaggrin gene. (A) Genomic DNA
from a single source was cut with the three enzymes DraI, SpeI, and
EcoRV, electrophoresed for 3 days to maximize resolution, processed
by Southern blotting, and probed with a coding probe (λHF10;
McKinley-Grant et al., 1989). The sizes of the two bands in each
case were measured with respect to high molecular weight markers
(Bethesda Research Labs). With the DraI data, for example, the
number of repeats was calculated as follows: the distance from the
proximal DraI site at the 5'-end to the first FLYQVST linker is 1.590
kbp (Figure 1); the distance from the last FLYQVST linker to the
proximal DraI site at the 3'-end is 1.169 kbp (Figure 2); the sizes
of the DraI fragments shown here are 13.4 and 12.4 kbp; the size of
each filaggrin repeating unit is 0.972 kbp; the number of repeats is
therefore (13.4 - 1.59 - 1.169)/0.972 = 10.9 and (12.4 - 1.59 -
1.169)/0.972 = 9.9. The numbers for EcoRV and SpeI calculated
the same way are 11.0 and 10.1 and 11.1 and 9.9, respectively. (B)
DNA from 12 individuals was digested with DraI and processed as
above. The three levels of bands correspond to 10, 11, or 12 repeats.

FIGURE 7: Mendelian segregation of human profilaggrin size alleles.
DNA from transformed lymphocytes of the several members of two
three-generation kindred families (CEPH cell lines; White et al., 1990)
was cut with EcoRV and characterized on Southern blots as in Figure
6. In all cases, one or two bands corresponding to 10, 11, or 12 repeats
were obtained, which were segregated between the various family
members as shown.

FIGURE 8: Structure of the human profilaggrin gene containing 10
filaggrin repeats. S = SpeI, which cuts only in the conserved flanking
regions; H = HgiAI, which cuts in the conserved linker region; F =
phenylalanine of the first consensus residue of each repeat. The
positions of the cap site, intron, initiating codon, termination codon,
and polyadenylation signal sequences are shown.


Discussion 

Data reported in this paper on the isolation of portions of
the human profilaggrin gene, as well as data from both protein
chemical (Harding & Scott, 1983; Resing et al., 1984, 1985,
1989; McKinley-Grant et al., 1989) and other recent cloning
experiments (Haydock & Dale, 1986; Rothnagel et al., 1987;
McKinley-Grant et al., 1989; Rothnagel & Steinert, 1990),
have now firmly established that filaggrins are expressed from
huge genes of relatively simple structure. The genes consist
of several tandemly arranged polynucleotide repeats of 972
bp [human; this paper and McKinley-Grant et al. (1989)],
750 bp (mouse; Rothnagel & Steinert, 1990), and 1272 bp
(rat; Haydock & Dale, 1986) and are devoid of introns in
coding regions. Thus, the genes encode large polyprotein
precursors consisting of numerous tandem filaggrin repeats.
This paper provides details of the 5'- and 3'-ends and thus
of the structure of the entire human profilaggrin gene (Figure
8). Even though the complete gene has not been isolated, by
taking advantage of certain restriction enzyme sites that occur
only in the conserved flanking regions, we are able to calculate
the number of repeats in it. Whereas a sample of DNA
obtained previously from one individual yielded only one band
when cut with the enzymes DraI and EcoRV (McKinley-Grant
et al., 1989), we now find in another single DNA sample
that these enzymes and SpeI generate two bands (Figure 6A).
Further, analysis of DNA from an additional 12 foreskins and
DNA from many members of 26 CEPH kindred families of
several different racial and ethnic origins reveals one or two
bands with DraI (Figure 6) or EcoRV (Figure 7). The precise
1-kbp difference in size of the bands in different individuals
(Figures 6 and 7), the conservation of the flanking sequences
of the gene, and the multiplicity of sites for these three
restriction enzymes (Figures 1 and 2) make it improbable that
these 1-kbp variations in size can be due to mutations in all
three restriction enzyme sites simultaneously. The most likely
explanation of these results is that the profilaggrin genes in
different individuals can contain 10, 11, or 12 full filaggrin
repeats; that is, the human profilaggrin gene system is
polymorphic with respect to the numbers of repeats. These data
may mean that there are multiple genes containing varying
numbers of repeats within any one individual, although if this
were the case, such multiple genes must be tightly linked to
the 1q21 region (McKinley-Grant et al., 1989). It is more
likely, however, that there is only one gene per haploid genome,
but the two copies of the gene in any one individual can contain
variable numbers of repeats due to simple allelic differences.
This notion is further supported by the finding (Figure 7) that
the different-sized bands corresponding to different numbers
of repeats segregate in kindred families by normal Mendelian
processes. These three allelic variants may have arisen by
unequal meiotic recombinations earlier in evolution and have
since been conserved. Even though these data represent more
than 300 individuals of several racial and ethnic groups, we
cannot exclude the possibility of additional allelic size variants
in a wider population survey.
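
The repeat-number calculation referred to above is spelled out in the Figure 6 legend. A minimal sketch of that arithmetic, using the DraI distances quoted there (the function and constant names are illustrative, not from the paper):

```python
def repeat_count(fragment_kbp, five_prime_kbp, three_prime_kbp, repeat_kbp=0.972):
    """Estimate the number of filaggrin repeats in a restriction fragment.

    The flanking distances (restriction site to first/last FLYQVST linker)
    are subtracted from the fragment size, and the remainder is divided by
    the size of one repeat.
    """
    return (fragment_kbp - five_prime_kbp - three_prime_kbp) / repeat_kbp


# DraI flanking distances quoted in the Figure 6 legend (kbp).
DRAI_5PRIME_KBP = 1.590
DRAI_3PRIME_KBP = 1.169

for band_kbp in (13.4, 12.4):
    n = repeat_count(band_kbp, DRAI_5PRIME_KBP, DRAI_3PRIME_KBP)
    print(f"{band_kbp} kbp band -> {n:.1f} repeats")
```

For the two DraI bands this gives 10.9 and 9.9, the values stated in the legend.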

One important conclusion of this finding is that it would
appear that the formation of a normal terminally differentiated
human epidermis is not critically dependent on the precise
amount of functional filaggrin produced from the precursor
gene; that is, to date we see a variation of 20% (10-12 repeats).
The rationale for such variability is not yet clear.

In our initial report on the human filaggrin system
(McKinley-Grant et al., 1989), we recognized the probability
of sequence variation between neighboring repeats. In this
paper we have compared the sequences of 11 different clones
including clones containing multiple adjacent repeats and find
(Figure 5) that all repeats are precisely 972 bp (324 amino
acid residues) long but display a bewildering array of sequence
variations. Such variations mean that the human filaggrin
system is doubly polymorphic: in addition to variable numbers
of repeats in profilaggrin, functional filaggrin also consists of
a heterogeneous population of molecules of similar size but
of considerable charge and sequence heterogeneity. There is
as much variation between neighboring repeats on the same
clone from the same individual as between repeats on different
clones from different individuals (Figures 4 and 5). So far,
we have found 39% of the 324 amino acid positions per repeat
are variable (Figure 5). Our data base contains information
from nine clones from two different foreskin cDNA libraries
and two genomic clones, all obtained from individuals of
similar ethnic origin. Accordingly, we expect more variations
will appear when a larger portion of the human population
is sampled. Nevertheless, most of the identified variations
represent conservative changes. Few if any changes involve
the appearance of a different type of amino acid that might
be expected to significantly change the structural properties
of the filaggrin molecules. Thus, although our data base is
limited in size, it seems likely that generally only conservative
substitutions are tolerated and that such changes are not
randomly distributed in the normal human profilaggrin gene.
Furthermore, our data show islands of tight sequence
conservation, which explains why some restriction enzymes cleave
DNA regularly. The most notable region is in the vicinity of
the linker (residues 319-13, Figure 5), as might be expected
since this is recognized by a common set of proteolytic
processing enzyme(s). In contrast, mouse filaggrin repeat
sequences seem to have been highly conserved, yet the linker
region is somewhat variable (Rothnagel & Steinert, 1990).
Future work will be directed toward an understanding of the
structural and functional significance of these sequence
variations.
We demonstrate here that the human profilaggrin gene
contains an intron in the 5'-untranslated region. Interestingly,
other genes expressed in mammalian epidermis such as
involucrin (Eckert & Green, 1988) and loricrin (D. Hohl and P.
Steinert, unpublished results) and epidermal derivatives such
as trichohyalin (Rothnagel & Rogers, 1986; Retz et al., 1990)
also possess simple gene structures. None of these genes
contain introns in coding portions, and all possess a single
intron in their 5'-untranslated regions. Each of these genes
encodes proteins having peptide or polypeptide repeats that
display considerable sequence variations yet retain certain
prominent structural motifs. The lack of introns within or
between the repeats probably reflects the simple evolutionary
processes of amplification and/or duplication involved in their
formation. The reason for their sequence variations is not clear
at this time. However, the fact that each is expressed in a
moribund tissue and ultimately functions in a dead cell to
afford a barrier against the environment reminds us of the
earlier view (Fraser et al., 1972) that such variations, providing
they retain certain essential structural motif(s), are tolerated
because they have no further effect on the life of the
organism. A further point of interest for future consideration
is that most if not all of these proteins probably function in
some way as intermediate filament-associated proteins, by
interacting directly or indirectly with the keratin IFs of the
various cell types (Steinert & Roop, 1988).

Examination of the sequences at the amino- and carboxyl-terminal
ends of human profilaggrin reveals the presence
of modified repeats that either start with unusual sequences
before merging into or end with unusual sequences in the midst
of the "consensus" filaggrin repeats. Their structural and
chemical properties are strikingly different from those of the
filaggrin repeat sequences: (i) the amino-terminal sequence
is α-helical, is likely to form or participate in the formation
of a coiled coil, is strongly acidic, and is notably enriched in
aromatic amino acids; (ii) the carboxyl-terminal sequence is
strongly basic and also hydrophobic; (iii) whereas the filaggrin
repeat sequences contain an average of 22 potential
phosphorylation sites per repeat (Resing et al., 1985, 1989;
Steinert, 1988), the amino- and carboxyl-terminal ends contain
0 and 2 such sites, respectively; (iv) the sequences appear to
have been highly conserved (Figures 1 and 2). Although searches
in data bases with the carboxyl-terminal sequences have
revealed no similarities to other proteins (except with the
carboxyl-terminal end of mouse filaggrin; Figure 3), the
amino-terminal sequence reveals modest homologies with certain
keratin IF chains because of a potential to form a coiled-coil
α-helical structure. Therefore, these sequences may serve an
important role in the function of profilaggrin, distinct from
its content of several filaggrin repeats. The hydrophobic nature
of the carboxyl-terminal sequences may aid in the proteolytic
processing, but other functions, if any, will have to await
further experiments. With respect to the amino-terminal
sequences, we note similar α-helical sequences are present on
other structural proteins, including procollagens (Bornstein
& Traub, 1980). Thus by analogy with the procollagens, we
suggest the following two possibilities for the function of the
amino-terminal sequences on human profilaggrin. They may
aid in the accumulation of the profilaggrin in the epidermis
by interaction with coiled-coil sequences on the adjacent
keratin IF, so as to in effect anchor the accumulating deposit
of protein. Alternatively, this could be accomplished when
two (or more) adjacent profilaggrin molecules associate by
interaction of their coiled-coil sequences to form a macroscopic
aggregate of protein. A third or concurrent function related
to their hydrophobic nature may be to aid in proteolytic
processing, as proposed for the carboxyl-terminal and linker
regions.

Several authors have hitherto referred to the initial
translation product of this gene system as profilaggrin (Resing
et al., 1984, 1985, 1989; Dale et al., 1989; Haydock & Dale,
1986). The use of this term now seems fully justified in view
of the data described in this paper, which clearly demonstrate
the presence of propeptide sequences at the termini.

In summary, we have characterized cDNA and genomic
DNA clones encoding the ends of the human profilaggrin gene,
which provide novel information on the extraordinary
polymorphisms of this gene system and which will now permit more
detailed studies on its expression and function in normal and
abnormal epidermal differentiation.
- Profilaggrin: 10 to 12 filaggrin units

- Sequence variability (intra- and inter-individual)

- Choice of the consensus sequence for the synthesis of the 33
peptides of 14 to 19 amino acids, with ends overlapping
by 5 a.a.
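
The overlapping-peptide design noted above can be illustrated with a simple tiling routine. This is only a sketch under simplifying assumptions: it uses one fixed peptide length, whereas the note describes lengths of 14 to 19 residues, so it does not attempt to reproduce the actual set of 33 peptides. The function name `tile_peptides` and the example string are hypothetical.

```python
def tile_peptides(sequence, length=15, overlap=5):
    """Split a sequence into fixed-length peptides whose ends overlap.

    Consecutive peptides share `overlap` residues; the last peptide may be
    shorter than `length` if the sequence does not divide evenly.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    if not sequence:
        return []
    step = length - overlap
    peptides = []
    start = 0
    while True:
        peptides.append(sequence[start:start + length])
        if start + length >= len(sequence):
            break
        start += step
    return peptides


# Hypothetical 25-letter example: two 15-mers sharing 5 residues.
example = "ABCDEFGHIJKLMNOPQRSTUVWXY"
for peptide in tile_peptides(example):
    print(peptide)
```

With a 5-residue overlap, any stretch of up to six consecutive residues lies wholly within at least one full-length peptide, apart from a possibly shorter final peptide.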
SEQUENCE LISTING



<110> SERRE, Guy
      GIRBAL-NEUHAUSER, Elizabeth
      VINCENT, Christian
      SIMON, Michel
      SEBBAG, Mireille
      DALBON, Pascal
      JOLIVET-REYNAUD, Colette
      ARNAUD, Michel
      JOLIVET, Michel

<120> ANTIGENS DERIVED FROM FILAGGRINS AND THEIR USE 
FOR THE DIAGNOSIS OF RHEUMATOID ARTHRITIS 

<130> M3Pbvl067/l . 

<140> 09/254,032 
<141> 1999-04-26 

<150> PCT/FR97/01541 
<151> 1997-09-01 

<150> FR96/10651 
<151> 1996-08-30 

<160> 7 

<170> PatentIn version 3.1

<210> 1 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
1067-1.ST25

<220> 

<223> Primer for amplification of a human filaggrin unit. 
<400> 1 

ttcctatacc aggtgagcac tcat

<210> 2 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer for amplification of a human filaggrin unit. 

<400> 2 

agaccctgaa cgtccagacc gtccc 

<210> 3 

<211> 49 

<212> PRT 

<213> Homo sapiens 

<400> 3 

Ser Thr Gly His Ser Gly Ser Gln His Ser His Thr Thr Thr Gln Gly
1               5                   10                  15

Arg Ser Asp Ala Ser Arg Gly Ser Ser Gly Ser Arg Ser Thr Ser Arg
            20                  25                  30

Glu Thr Arg Asp Gln Glu Gln Ser Gly Asp Gly Ser Arg His Ser Gly
        35                  40                  45

Ser

<210> 4 

<211> 37 

<212> PRT 

<213> Homo sapiens 

<400> 4 

Ser Gln Asp Arg Asp Ser Gln Ala Gln Ser Glu Asp Ser Glu Arg Arg
1               5                   10                  15

Ser Ala Ser Ala Ser Arg Asn His Arg Gly Ser Ala Gln Glu Gln Ser
            20                  25                  30

Arg Asp Gly Ser Arg
        35

<210> 5

<211> 14

<212> PRT

<213> Homo sapiens

<400> 5

Glu Gln Ser Ala Asp Ser Ser Arg His Ser Gly Ser Gly His
1               5                   10

<210> 6

<211> 14

<212> PRT

<213> Homo sapiens

<400> 6

Glu Ser Ser Arg Asp Gly Ser Arg His Pro Arg Ser His Asp
1               5                   10

<210> 7

<211> 324

<212> PRT

<213> Homo sapiens

<220>

<221> VARIANT

<222> (3)..(3)

<223> Y replaced by I

<220>

<221> VARIANT

<222> (12)..(12)

<223> E replaced by D


<220> 

<221> VARIANT 

<222> (14) . . (14) 

<223> S replaced by A or T 

<220> 

<221> VARIANT 

<222> (17) . . (17) 

<223> R replaced by Q or W 

<220> 

<221> VARIANT 

<222> (18).. (18) 

<223> S replaced by A or T 

<220> 

<221> VARIANT 

<222> (19) . . (19) 

<223> G replaced by R, A or V

<220> 

<221> VARIANT 

<222> (20) . . (20) 

<223> T replaced by P 

<220> 

<221> VARIANT 

<222> (23).. (23) 

<223> G replaced by R 

<220> 

<221> VARIANT 
<222> (24)..(24)

<223> G replaced by R 

<220> 

<221> VARIANT 

<222> (29) . . (29) 

<223> H replaced by R 

<220> 

<221> VARIANT 

<222> (31).. (31) 

<223> E replaced by D or Q 

<220> 

<221> VARIANT 

<222> (34).. (34) 

<223> R replaced by Q 

<220> 

<221> VARIANT 

<222> (38).. (38) 

<223> R replaced by G 

<220> 

<221> VARIANT 

<222> (41) . . (41) 

<223> T replaced by A 

<220> 

<221> VARIANT 

<222> (44).. (44) 

<223> E replaced by Q or P

<220> 
<221> VARIANT

<222> (50).. (50) 

<223> H replaced by R 

<220> 

<221> VARIANT 

<222> (51).. (51) 

<223> G replaced by A 

<220> 

<221> VARIANT 

<222> (52).. (52) 

<223> H replaced by R 

<220> 

<221> VARIANT 

<222> (55).. (55) 

<223> S replaced by P 

<220> 

<221> VARIANT 

<222> (56).. (56) 

<223> S replaced by R 

<220> 

<221> VARIANT 

<222> (57).. (57) 

<223> S replaced by R 

<220> 

<221> VARIANT 

<222> (58).. (58) 

<223> G replaced by R 
<220>

<221> VARIANT 

<222> (61) . . (61) 

<223> Q replaced by H 

<220> 

<221> VARIANT 

<222> (63) . . (63) 

<223> S replaced by Y 

<220> 

<221> VARIANT 

<222> (64) . . (64) 

<223> H replaced by Y 

<220> 

<221> VARIANT 

<222> (65).. (65) 

<223> Y replaced by H 

<220> 

<221> VARIANT 

<222> (68) . . (68) 

<223> S replaced by L 

<220> 

<221> VARIANT 

<222> (69).. (69) 

<223> V replaced by A 

<220> 

<221> VARIANT 

<222> (70) . . (70) 

<223> D replaced by N 
<220>

<221> VARIANT 

<222> (71).. (71) 

<223> R replaced by S 

<220> 

<221> VARIANT 

<222> (72).. (72) 

<223> S replaced by T 

<220> 

<221> VARIANT 

<222> (75).. (75) 

<223> S replaced by R 

<220> 

<221> VARIANT 

<222> (78).. (78) 

<223> H replaced by Q 

<220> 

<221> VARIANT 

<222> (80) . . (80) 

<223> S replaced by T 

<220> 

<221> VARIANT 

<222> (84) . . (84) 

<223> S replaced by T 

<220> 

<221> VARIANT 

<222> (92) . . (92) 
<223> H replaced by R

<220>

<221> VARIANT 

<222> (94) . . (94) 

<223> T replaced by S, Q or H 

<220> 

<221> VARIANT 

<222> (100) . . (100) 

<223> A replaced by T 

<220> 

<221> VARIANT 

<222> (103) . . (103) 

<223> Q replaced by E or T 

<220> 

<221> VARIANT 

<222> (105).. (105) 

<223> R replaced by H 

<220> 

<221> VARIANT 

<222> (106) . . (106) 

<223> N replaced by D 

<220> 

<221> VARIANT 

<222> (107) . . (107) 

<223> Q replaced by E or D 

<220> 

<221> VARIANT 
<222> (114)..(114)

<223> S replaced by T 

<220> 

<221> VARIANT 

<222> (120) . . (120) 

<223> R replaced by H 

<220> 

<221> VARIANT 

<222> (126) . . (126) 

<223> S replaced by T 

<220> 

<221> VARIANT 

<222> (127) . . (127) 

<223> R replaced by W or Q 

<220> 

<221> VARIANT 

<222> (129) . . (129) 

<223> D replaced by E 

<220> 

<221> VARIANT 

<222> (132) . . (132) 

<223> R replaced by G 

<220> 

<221> VARIANT 

<222> (135).. (135) 

<223> Q replaced by A 



<220> 
<221> VARIANT

<222> (136).. (136) 

<223> V replaced by A or S 

<220> 

<221> VARIANT 

<222> (137).. (137) 

<223> G replaced by V 

<220> 

<221> VARIANT 

<222> (140). .(140) 

<223> E replaced by Q or D 

<220> 

<221> VARIANT 

<222> (142) . . (142) 

<223> S replaced by A or E 

<220> 

<221> VARIANT 

<222> (144).. (144) 

<223> P replaced by S 

<220> 

<221> VARIANT 

<222> (146) . . (146) 

<223> T replaced by R 

<220> 

<221> VARIANT 

<222> (149) . . (149) 

<223> N replaced by R 
<220>

<221> VARIANT 

<222> (150) . . (150) 

<223> Q replaced by W 

<220> 

<221> VARIANT 

<222> (154) . . (154) 

<223> F replaced by V

<220> 

<221> VARIANT 

<222> (158).. (158) 

<223> S replaced by R 

<220> 

<221> VARIANT 

<222> (161) . . (161) 

<223> Q replaced by E 

<220> 

<221> VARIANT 

<222> (162) . . (162) 

<223> G replaced by A 

<220> 

<221> VARIANT 

<222> (163) . . (163) 

<223> H replaced by Q or Y 

<220> 

<221> VARIANT 

<222> (164) . . (164) 

<223> S replaced by P 
<220>

<221> VARIANT 

<222> (170) . . (170) 

<223> W replaced by R or H

<220> 

<221> VARIANT 

<222> (172).. (172) 

<223> G replaced by A or E 

<220> 

<221> VARIANT 

<222> (179) . . (179) 

<223> H replaced by R 

<220> 

<221> VARIANT 

<222> (182).. (182) 

<223> A replaced by S 

<220> 

<221> VARIANT 

<222> (183) . . (183) 

<223> Q replaced by R or W 

<220> 

<221> VARIANT 

<222> (186) . . (186) 

<223> S replaced by L 

<220> 

<221> VARIANT 

<222> (189) . . (189) 
<223> G replaced by V

<220>

<221> VARIANT

<222> (194).. (194) 

<223> R replaced by G, S or T 

<220> 

<221> VARIANT 

<222> (197) . . (197) 

<223> Q replaced by H, D or E 

<220> 

<221> VARIANT 

<222> (200) . . (200) 

<223> R replaced by T

<220> 

<221> VARIANT 

<222> (202) . . (202) 

<223> G replaced by S 

<220> 

<221> VARIANT 

<222> (204) . . (204) 

<223> G replaced by R or V

<220> 

<221> VARIANT 

<222> (205) . . (205) 

<223> H replaced by Q or D 



<220>

<221> VARIANT

<222> (207)..(207)

<223> A replaced by S

<220>

<221> VARIANT 

<222> (208) . . (208) 

<223> D replaced by E or P 

<220> 

<221> VARIANT 

<222> (209) . . (209) 

<223> S replaced by V

<220> 

<221> VARIANT 

<222> (210) . . (210) 

<223> S replaced by R or Q 

<220> 

<221> VARIANT 

<222> (211) . . (211) 

<223> R replaced by S, D or Q 

<220> 

<221> VARIANT 

<222> (212) . . (212) 

<223> Q replaced by D or S 

<220> 

<221> VARIANT 

<222> (215).. (215) 

<223> T replaced by R 



<220> 
<221> VARIANT

<222> (216).. (216) 

<223> R replaced by H 

<220> 

<221> VARIANT 

<222> (217).. (217) 

<223> H replaced by A 

<220> 

<221> VARIANT 

<222> (218).. (218) 

<223> T replaced by A or E 

<220> 

<221> VARIANT 

<222> (219) . . (219) 

<223> Q replaced by E 

<220> 

<221> VARIANT 

<222> (220).. (220) 

<223> T replaced by N or S 

<220> 

<221> VARIANT 

<222> (223).. (223) 

<223> G replaced by R 

<220> 

<221> VARIANT 

<222> (224) . . (224) 

<223> G replaced by R 
<220>

<221> VARIANT 

<222> (226) . . (226) 

<223> A replaced by T 

<220> 

<221> VARIANT 

<222> (230).. (230) 

<223> H replaced by Q 

<220> 

<221> VARIANT 

<222> (236). .(236) 

<223> S replaced by R 

<220> 

<221> VARIANT 

<222> (237).. (237) 

<223> A replaced by P 

<220> 

<221> VARIANT 

<222> (239).. (239) 

<223> D replaced by E 

<220> 

<221> VARIANT 

<222> (244) . . (244) 

<223> G replaced by H 

<220> 

<221> VARIANT 

<222> (245).. (245) 

<223> H replaced by Y or Q 
<220>

<221> VARIANT 

<222> (255).. (255) 

<223> S replaced by A 

<220> 

<221> VARIANT 

<222> (256).. (256) 

<223> G replaced by A 

<220> 

<221> VARIANT 

<222> (257).. (257) 

<223> I replaced by T 

<220> 

<221> VARIANT 

<222> (259).. (259) 

<223> H replaced by R 

<220> 

<221> VARIANT 

<222> (264) . . (264) 

<223> S replaced by T

<220> 

<221> VARIANT 

<222> (267).. (267) 

<223> R replaced by S 

<220> 

<221> VARIANT 

<222> (269) . . (269) 
<223> S replaced by R

<220> 

<221> VARIANT 

<222> (274).. (274) 

<223> S replaced by Y 

<220> 

<221> VARIANT 

<222> (275).. (275) 

<223> S replaced by R 

<220> 

<221> VARIANT 

<222> (280) . . (280) 

<223> S replaced by T 

<220> 

<221> VARIANT 

<222> (282).. (282) 

<223> N replaced by Q or S 

<220> 

<221> VARIANT 

<222> (288) . . (288) 

<223> D replaced by N 

<220> 

<221> VARIANT 

<222> (291) . . (291) 

<223> T replaced by S 



<220>

<221> VARIANT

<222> (295)..(295)

<223> S replaced by A 

<220> 

<221> VARIANT 

<222> (296) . . (296) 

<223> A replaced by G 

<220> 

<221> VARIANT 

<222> (297) . . (297) 

<223> H replaced by Q 

<220> 

<221> VARIANT 

<222> (298).. (298) 

<223> G replaced by R 

<220> 

<221> VARIANT 

<222> (301) . . (301) 

<223> G replaced by R 

<220> 

<221> VARIANT 

<222> (302) . . (302) 

<223> S replaced by P 

<220> 

<221> VARIANT 

<222> (303).. (303) 

<223> H replaced by R 



<220> 
<221> VARIANT

<222> (304) . . (304) 

<223> Q replaced by H 

<220> 

<221> VARIANT 

<222> (305).. (305) 

<223> Q replaced by E 

<220> 

<221> VARIANT 

<222> (306) . . (306) 

<223> S replaced by A 

<220> 

<221> VARIANT 

<222> (311) . . (311) 

<223> T replaced by A 

<220> 

<221> VARIANT 

<222> (314).. (314) 

<223> R replaced by Q 

<220> 

<221> VARIANT 

<222> (316).. (316) 

<223> R replaced by Q, A or G 

<220> 

<221> VARIANT 

<222> (317).. (317) 

<223> G replaced by E 
<220>

<221> VARIANT 

<222> (318).. (318) 

<223> R replaced by T, G or S 

<400> 7 

Phe Leu Tyr Gln Val Ser Thr His Glu Gln Ser Glu Ser Ser His Gly
1               5                   10                  15

Arg Ser Gly Thr Ser Thr Gly Gly Arg Gln Gly Ser His His Glu Gln
            20                  25                  30

Ala Arg Asp Ser Ser Arg His Ser Thr Ser Gln Glu Gly Gln Asp Thr
        35                  40                  45

Ile His Gly His Pro Gly Ser Ser Ser Gly Gly Arg Gln Gly Ser His
    50                  55                  60

Tyr Glu Gln Ser Val Asp Arg Ser Gly His Ser Gly Ser His His Ser
65                  70                  75                  80

His Thr Thr Ser Gln Gly Arg Ser Asp Ala Ser His Gly Thr Ser Gly
                85                  90                  95

Ser Arg Ser Ala Ser Arg Gln Thr Arg Asn Gln Glu Gln Ser Gly Asp
            100                 105                 110

Gly Ser Arg His Ser Gly Ser Arg His His Glu Ala Ser Ser Arg Ala
        115                 120                 125

Asp Ser Ser Arg His Ser Gln Val Gly Gln Gly Glu Ser Ser Gly Pro
    130                 135                 140

Arg Thr Ser Arg Asn Gln Gly Ser Ser Phe Ser Gln Asp Ser Asp Ser
145                 150                 155                 160

Gln Gly His Ser Glu Asp Ser Glu Arg Trp Ser Gly Ser Ala Ser Arg
                165                 170                 175

Asn His His Gly Ser Ala Gln Glu Gln Ser Arg Asp Gly Ser Arg His
            180                 185                 190

Pro Arg Ser His Gln Glu Asp Arg Ala Gly His Gly His Ser Ala Asp
        195                 200                 205

Ser Ser Arg Gln Ser Gly Thr Arg His Thr Gln Thr Ser Ser Gly Gly
    210                 215                 220

Gln Ala Ala Ser Ser His Glu Gln Ala Arg Ser Ser Ala Gly Asp Arg
225                 230                 235                 240

His Gly Ser Gly His Gln Gln Ser Ala Asp Ser Ser Arg His Ser Gly
                245                 250                 255

Ile Gly His Gly Gln Ala Ser Ser Ala Val Arg Asp Ser Gly His Arg
            260                 265                 270

Gly Ser Ser Gly Ser Gln Ala Ser Asp Asn Glu Gly His Ser Glu Asp
        275                 280                 285

Ser Asp Thr Gln Ser Val Ser Ala His Gly Gln Ala Gly Ser His Gln
    290                 295                 300

Gln Ser His Gln Glu Ser Thr Arg Gly Arg Ser Arg Gly Arg Ser Gly
305                 310                 315                 320

Arg Ser Gly Ser
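
Two simple consistency checks can be run on entries of this listing: that a sequence has the length declared in its <211> field, and that the forward primer of SEQ ID NO: 1 encodes the first residues of the filaggrin unit of SEQ ID NO: 7. The sketch below is illustrative only. The helper names are hypothetical, the standard genetic code is assumed, and the sequences are typed in by hand from the entries above.

```python
# Three-letter to one-letter amino acid codes (20 standard residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def to_one_letter(three_letter_text):
    """Convert space-separated three-letter residues to a one-letter string."""
    return "".join(THREE_TO_ONE[token] for token in three_letter_text.split())


def translate(dna):
    """Translate a DNA string in frame 1, ignoring spaces and case."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


seq1 = "ttcctatacc aggtgagcac tcat"  # SEQ ID NO: 1, <211> 24
seq6 = "Glu Ser Ser Arg Asp Gly Ser Arg His Pro Arg Ser His Asp"  # <211> 14
seq7_start = "Phe Leu Tyr Gln Val Ser Thr His"  # residues 1-8 of SEQ ID NO: 7

print(len(seq1.replace(" ", "")), len(seq6.split()))
print(translate(seq1))
print(to_one_letter(seq6))
print(translate(seq1) == to_one_letter(seq7_start))
```

Read in frame 1, the primer translates to FLYQVSTH, the same eight residues that open SEQ ID NO: 7.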